Growing Dynamics of Internet Providers.

In this paper we present a model for the growth and evolution of Internet providers. The model
reproduces the data observed for the Internet connection as probed by tracing routes from different
computers. This problem represents a paramount case study for growth processes in general, and
can also help in understanding the properties of the Net. Our main result is that this network
can be reproduced by a self-organized interaction between users and providers that can rearrange
in time. This model can then be considered a prototype for the class of aggregation processes
in social networks.



Networks are systems composed of elementary units, the nodes, connected by directed or
undirected links. The number of links pointing to a node, $k$, is known as the degree of the
node, and its distribution characterizes the connectivity of the network. This simple structure
is almost ubiquitous in Nature, and the reason for such success is often linked to the
optimization of some cost function. For example, in all transport processes networks are
selected to distribute the quantities of interest efficiently among the connected sites.
Networks can also be used to describe both the spreading of information or diseases [1] and
physical structures such as river basins [?], biological distribution networks (vascular
systems) [?] and some properties of the hardware layout of the Internet [?]. A detailed
discussion of such networks and of some models is given in Ref. [?].

Here we present some experimental measurements of the network of Internet providers and
propose a simplified model to explain them. It is worth noting that this network does not
correspond to the one composed of web pages: it is composed of the physical connections
between computers, and the measurements come from the analysis of the data provided by the
Internet Mapping Project [?]. Hereafter we discuss only this particular system; we do not aim
to describe the network of web pages or other social systems. Recently, some statistical
properties of the connectivity of this physical network have been investigated. A tree-like
structure has been found by checking the router connections from a starting point. Despite
the bias introduced by observing the Net from a single node, some statistical features can be
established, such as the power-law distribution of the degree. Here, instead, we focus on the
possible dynamics behind the formation of such a structure. The main result of the data
analysis is the power-law distribution of the site degree, which shows the absence of a
characteristic scale. It would then be tempting to assume that such a scale-free distribution
originates from some sort of optimization of the supply present in the providers market. This
is the main idea inspiring our dynamical model, which should mimic the evolution of a system
of users and providers.

The model we propose here is closely related to a prototype growth model introduced by
Simon [?] and recently improved [?] in order to explain the widespread occurrence of fractal
behaviour in several cases, ranging from web-page statistics [?] to scientific citations [12]
and actors in the same movie cast [?]. Some of the networks considered display these
scale-free properties as a result of some optimization, as, for example, blood vessels [?] or
river basins [?]. In others, a "Small World" phenomenon arises [?], and through suitable
shortcuts all the points are connected to one another in a few steps. Together with the
numerical analysis of real social networks, the physics community is making a strong effort
to find suitable theoretical models for such systems.

The data set is obtained through a computer instruction that traces the route from one
terminal to any allowed address in the Internet domain. The UNIX command traceroute records
all the nodes through which the target is reached from the starting point where the command
is run. These paths can change over time for two reasons. Firstly, the routes reconfigure,
since the path varies according to the traffic at the moment or, more generally, according to
the availability of the connection. Secondly, the whole structure is physically evolving
because of the new connections that take place. Nevertheless, the main statistical properties
of this structure remain constant in time even if the total number of connections increases.
These data can be arranged in a tree-like structure in which providers are organized in
levels: the main providers on the top level are linked to secondary providers, which provide
the connection to successive levels down to the common user level. The degree of the
providers can now be computed over all the levels of the network. The main result is that the
Probability Density Function (PDF) of finding a node with degree $k$ follows a power law
(see Fig. 1),

$$P(k) \propto k^{-\gamma}, \qquad (1)$$

where the exponent $\gamma$ is equal to 2.2. Since a similar value is also known to describe
the power-law distribution of links in web pages, it is possible that a similar evolution
holds for both of them.
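
The way traced routes turn into a degree histogram can be sketched as follows. This is a minimal illustration rather than the procedure actually used for the data: it assumes each route is available as an ordered list of hop names, it counts the distinct neighbours of every node as its degree, and the host names and the function `degree_distribution` are hypothetical.

```python
from collections import Counter, defaultdict


def degree_distribution(paths):
    """Return {degree: number of nodes} for the graph spanned by traced routes.

    `paths` is an iterable of hop lists, each ordered from the source outwards.
    """
    neighbours = defaultdict(set)
    for path in paths:
        for a, b in zip(path, path[1:]):
            if a != b:                      # ignore repeated hops
                neighbours[a].add(b)
                neighbours[b].add(a)
    return Counter(len(nbrs) for nbrs in neighbours.values())


# Hypothetical routes, all traced from the same source host.
paths = [
    ["src", "gw", "isp-a", "host-1"],
    ["src", "gw", "isp-a", "host-2"],
    ["src", "gw", "isp-b", "host-3"],
]
histogram = degree_distribution(paths)
```

Because every route starts from the same host, links between nodes at the same distance from the source never appear in `paths`, which is the single-source bias mentioned in the text.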

In particular, we propose a mechanism that describes the development of the connections
between two subsequent levels in a network. In our model, two different classes of nodes are
present, representing providers and users (who could, possibly, act as providers for a lower
level of users). Sites representing providers can have several links pointing to other sites
corresponding to users. Users, on the other hand, have a single link pointing to their
provider; they are not allowed to have more than one provider. By iterating this microscopic
interaction level by level one could, in principle, recover the whole tree structure of the
network. At each time step, a node is added to the network. The new node can be either a
provider, with probability $r$, or a user, with probability $1 - r$. When a provider is
added, $D(t)$ users in the network are chosen at random and rewired to the new provider;
their links to the previous providers are then removed. We assume that the integer $D(t)$ is
a random variable with Poisson distribution and mean value $d$. This aims to mimic the fact
that a real provider decides to enter the network when it expects to acquire a certain number
of connected users, $d$ on average, according to some microeconomic optimization rule. The
randomness of $D(t)$ takes into account inexact forecasts of the number of rewiring users.
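
The growth rule can be sketched as a short simulation. The following is a minimal Python sketch, not the authors' code. It assumes an initial network of one provider with one user (the text does not specify the starting configuration), and it uses the rule, stated in the next paragraph, that a new user joins a provider with probability proportional to its degree. The names `simulate`, `poisson`, `provider_of` and `degree` are illustrative.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(steps, r, d, seed=0):
    """Grow the network for `steps` time steps.

    Returns (degree, provider_of): degree[p] is the number of users of provider p,
    provider_of[u] is the provider of user u.
    Assumed initial condition (not given in the text): one provider with one user.
    """
    rng = random.Random(seed)
    provider_of = [0]
    degree = [1]
    for _ in range(steps):
        if rng.random() < r:
            # New provider: rewire D(t) ~ Poisson(d) randomly chosen users to it.
            new = len(degree)
            degree.append(0)
            n_rewired = min(poisson(rng, d), len(provider_of))
            for u in rng.sample(range(len(provider_of)), n_rewired):
                degree[provider_of[u]] -= 1
                provider_of[u] = new
                degree[new] += 1
        else:
            # New user: copying the provider of a uniformly chosen user selects
            # a provider with probability proportional to its degree.
            p = provider_of[rng.randrange(len(provider_of))]
            provider_of.append(p)
            degree[p] += 1
    return degree, provider_of


degree, provider_of = simulate(5_000, r=0.29, d=0.41, seed=1)
```

Picking a random user and copying its provider avoids keeping an explicit table of attachment probabilities. A provider that is born with no users, or that loses all of them, can never be chosen again by this rule.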

The addition of a provider does not change the total number of links in the network. When a
user is added, instead, it is linked through a new link to an existing provider, so the
addition of a user increases the total number of links by one. The probability that a
provider acquires a new user is proportional to its degree, that is, the number of users it
is linked to. This rule, known as the "richer gets richer" condition, is at the basis of the
typical behaviour observed in scale-free networks [?], in contrast to the features shown by
ordinary random graphs [?]. We call $k_i(t)$ the degree of the $i$-th provider (introduced at
time $t_i$) and

$$K(t) = \sum_i k_i(t) \qquad (2)$$



the total number of links (and users) in the network at time $t$. A user is added at a rate
$(1-r)$ per time step and is connected to a provider with probability proportional to its
degree, so the $i$-th provider acquires a new link with probability $(1-r)k_i/K$. A provider
is added with probability $r$ at each time step. Each user has the same probability $1/K$ of
being rewired to the new provider. Thus, a provider with degree $k_i$ loses $l$ users
($l > 0$) with "binomial" probability $P_c(l)$,

$$P_c(l) = \binom{D(t)}{l} \left(\frac{k_i}{K}\right)^{l} \left(1 - \frac{k_i}{K}\right)^{D(t)-l}, \qquad (3)$$



whose mean value is $D(t)k_i/K$. The degree of a provider does not change with the remaining
probability $(1-r)(1-k_i/K) + r(1-k_i/K)^{D(t)}$. Since new links are created at rate $(1-r)$
per time step, the number of links at time $t$ is $K(t) = (1-r)t$ for large $t$. Thus, one
can compute the time evolution of the connectivity $\bar{k}_i(t)$ averaged over many
realizations of the model. To do that, we assume that the correlation between $k_i(t)$ and
$D(t)$ can be neglected, so that the two averages can be taken independently and $D(t)$ can
be replaced by $d$ in the mean value of eq. (3):

$$\bar{k}_i(t+1) = \bar{k}_i(t) + (1-r)\,\frac{\bar{k}_i(t)}{K(t)} - r\,d\,\frac{\bar{k}_i(t)}{K(t)}, \qquad (4)$$



where the second term on the right-hand side of this equation corresponds to the addition of
a new user, and the third term corresponds to the subtraction of links after the birth of a
new provider. This equation can be written in the continuous limit as

$$\frac{d\bar{k}_i}{dt} = \frac{1-(d+1)r}{(1-r)\,t}\,\bar{k}_i. \qquad (5)$$

This, with the boundary condition $\bar{k}_i(t_i) = d$, gives

$$\bar{k}_i(t) = d \left(\frac{t}{t_i}\right)^{\beta}, \qquad (6)$$

where

$$\beta = \frac{1-(d+1)r}{1-r}. \qquad (7)$$
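
A quick numerical check of this step: the sketch below iterates the discrete map of equation (4) with $K(t) = (1-r)t$ and compares the result with the power law of equations (6) and (7). It is an illustration only; the function names and the chosen birth and end times are assumptions, and the parameter values are the ones quoted later in the text.

```python
def iterate_mean_degree(r, d, t_birth, t_end):
    """Iterate the discrete mean-field map of equation (4), starting from k = d at t_birth."""
    k = float(d)
    for t in range(t_birth, t_end):
        total_links = (1 - r) * t                     # K(t) = (1 - r) t
        k += (1 - r) * k / total_links - r * d * k / total_links
    return k


def power_law_degree(r, d, t_birth, t_end):
    """Closed form k_i(t) = d (t / t_i)^beta of equations (6) and (7)."""
    beta = (1 - (d + 1) * r) / (1 - r)
    return d * (t_end / t_birth) ** beta


r, d = 0.29, 0.41
iterated = iterate_mean_degree(r, d, t_birth=1_000, t_end=100_000)
closed = power_law_degree(r, d, t_birth=1_000, t_end=100_000)
relative_gap = abs(iterated - closed) / closed
```

For a provider born at a large time $t_i$ the two values differ only by the discretization error of the continuous limit, so `relative_gap` is small.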



One can see that $\bar{k}_i(t)/K(t)$ goes to 0 as $t$ goes to $\infty$ for all $i$, since the
exponent in equation (6) is smaller than 1, showing that no node grows in degree as fast as
the whole network. The stability of the network is then assured. The dynamical behavior
described by equation (6) is in good agreement with the numerical simulations of the model,
whose dynamical properties are shown in the inset of Fig. 1 for a single provider. By means
of this relation between time and degree, we can now compute the probability that a provider
has a degree less than $k$, $P(k_i < k)$. We assume that $P(k_i < k) \simeq P(\bar{k}_i < k)$.
Solving equation (6) for $t_i$, one can see that

$$\bar{k}_i(t) < k \iff \frac{t_i}{t} > \left(\frac{k}{d}\right)^{-\frac{1-r}{1-(d+1)r}}. \qquad (8)$$



This means that providers with a degree less than a given value $k$ are those that have been
added to the network after a corresponding time and have not had enough time to develop a
cluster of $k$ users around them. Since nodes are added at a uniform rate, $t_i/t$ is
uniformly distributed between 0 and 1, and

$$P(k_i < k) \simeq P\left(\frac{t_i}{t} > \left(\frac{k}{d}\right)^{-\frac{1-r}{1-(d+1)r}}\right) = 1 - \left(\frac{k}{d}\right)^{-\frac{1-r}{1-(d+1)r}}. \qquad (9)$$

We can then write for the PDF $p(k)$

$$p(k) = \frac{dP(k_i < k)}{dk}, \qquad (10)$$

which yields $p(k) \propto k^{-\gamma}$, where

$$\gamma = \frac{2-(d+2)r}{1-(d+1)r}. \qquad (11)$$
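
The argument leading to equation (11) can be checked by sampling. The sketch below is illustrative and is not the simulation of the full model: it draws birth times with $t_i/t$ uniform in $(0, 1]$, assigns each provider the mean-field degree of equation (6), and estimates the tail exponent with a maximum-likelihood (Hill-type) estimator, which is a technique chosen here for convenience. The function names are assumptions.

```python
import math
import random


def predicted_gamma(r, d):
    """Exponent of equation (11)."""
    return (2 - (d + 2) * r) / (1 - (d + 1) * r)


def estimated_gamma(r, d, n_providers=100_000, seed=0):
    """Estimate the exponent from degrees k = d (t / t_i)^beta with uniform t_i / t."""
    rng = random.Random(seed)
    beta = (1 - (d + 1) * r) / (1 - r)
    log_sum = 0.0
    for _ in range(n_providers):
        u = 1.0 - rng.random()              # t_i / t, uniform in (0, 1]
        k = d * u ** (-beta)
        log_sum += math.log(k / d)
    tail_index = n_providers / log_sum      # maximum-likelihood estimate of 1 / beta
    return 1.0 + tail_index


r, d = 0.29, 0.41
gamma_theory = predicted_gamma(r, d)
gamma_sampled = estimated_gamma(r, d)
```

The two values agree up to sampling noise, which reflects the identity $\gamma = 1 + 1/\beta$ implied by equations (7) and (11).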



Equation (11) provides us with an upper bound on $r$, since one can see that the exponent
diverges at $r = 1/(d+1)$. Numerical simulations, as shown in Fig. 1, follow the predicted
behavior. We recall here that the external parameters $r$ and $d$ are estimated from
statistical surveys of the Internet. Through the traceroute procedure one can describe the
connection to the outlet by means of a tree-like structure. However, the iteration of this
procedure does not show the whole structure of a given region of the Internet, since
cross-links between sites at the same distance are not seen. Yet, some statistical properties
of the considered network can still be established. We assume that traceroute shows only a
given fraction $\mu < 1$ of the real number $k$ of links pointing to a site. This, however,
does not affect the shape of the distribution (if it is a scale-free one) or the reliability
of our statistical survey. We then call $k_{eff} = \mu k$ the apparent degree of a site. In
the traceroute picture of the physical network of the Internet, the degree distribution
density $q(k_{eff})$ shows a power-law behavior

$$q(k_{eff}) \propto k_{eff}^{-\alpha}, \qquad (12)$$



where $\alpha \simeq 2.2$. For the considerations made above, the exponent found by the
traceroute analysis is a good approximation of the real value. This value is slightly
different from the 2.48 reported in the analysis of Ref. [?]. We believe that this difference
arises mainly from the growth of the Internet, which is now very different from what it was
at the time. This enables us to write $\gamma \simeq 2.2$. The connectivity
$\langle k \rangle$ cannot be computed by traceroute, since the fraction $\mu$ is unknown.
Nevertheless, $\langle k \rangle$ is provided in other published statistical analyses [?],
according to which the connectivity is 3.4. We checked that, for the first layers of the data
analyzed, the measured value is not far from this one. We also notice a decrease of the
connectivity with the distance from the source of traceroute. We decide here to focus on the
first levels that can be effectively probed by this analysis. If we assume that our model
describes the way a network is built at each level, the predicted value for
$\langle k \rangle$ is

$$\langle k \rangle = 1 + \frac{1-r}{r}. \qquad (13)$$



The unity on the right-hand side takes into account that each site has a provider and a link
that points to it; this is not considered in our model and must be explicitly added. The
second term is the ratio between users and providers in our model.

This equation provides us with the value $r \simeq 0.29$. Substituting this value into
equation (11), one recovers $d \simeq 0.41$. Such a value of $d$, smaller than 1, shows that
our model describes the real structure of the Internet when some providers are introduced
without rewiring any user, as suggested by the third term on the right-hand side of
equation (4). If a new provider is born and no user gets rewired, the provider is sentenced
to death, since a provider without users cannot survive.
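
The two parameter values follow from inverting equations (13) and (11). The short sketch below reproduces the arithmetic; the function names are illustrative, and the only inputs are the measured values $\langle k \rangle = 3.4$ and $\gamma = 2.2$ quoted above.

```python
def provider_fraction(mean_degree):
    """Invert equation (13): <k> = 1 + (1 - r) / r = 1 / r."""
    return 1.0 / mean_degree


def rewired_users(gamma, r):
    """Solve equation (11) for d at fixed gamma and r."""
    return (gamma - 2.0) * (1.0 - r) / ((gamma - 1.0) * r)


def exponent(r, d):
    """Equation (11), used here as a consistency check."""
    return (2.0 - (d + 2.0) * r) / (1.0 - (d + 1.0) * r)


r = provider_fraction(3.4)            # measured connectivity <k> = 3.4
d = rewired_users(2.2, round(r, 2))   # uses the rounded value r = 0.29, as in the text
print(f"r = {r:.2f}, d = {d:.2f}")
```

With the rounded $r = 0.29$ this gives $d \simeq 0.41$; keeping the unrounded $r = 1/3.4$ gives $d = 0.4$, so the second digit of $d$ depends on the rounding of $r$.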

Until now the computation has been done under the limiting hypothesis that each user is
connected to only one provider. One can study through numerical simulation the behaviour when
users are allowed to be linked to different providers. In this case, when a provider is
added, users rewired to it keep their old provider connection.

In our model, the possibility of being connected to several providers corresponds to
neglecting the third term in equation (4), which takes into account the probability for a
provider to lose a user to a newborn provider, and replacing $K(t)$, the total number of
links, by

$$K(t) = [1 + r(d-1)]\,t, \qquad (14)$$



since now one link is added when a user is added, and $d$ links, on average, when a provider
is added. Performing the same computation as above, one would expect to obtain a scale-free
degree distribution with an exponent

$$\gamma = \frac{2 + r(d-2)}{1-r}. \qquad (15)$$
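
As a worked step, here is a sketch of the computation the text refers to, under the same mean-field assumptions as before. Dropping the loss term of equation (4) and using equation (14) for $K(t)$ gives the chain below; the symbol $\beta'$ is a label introduced only for this sketch.

```latex
\frac{d\bar{k}_i}{dt} = (1-r)\,\frac{\bar{k}_i}{K(t)}
                      = \frac{1-r}{1+r(d-1)}\,\frac{\bar{k}_i}{t}
\quad\Longrightarrow\quad
\bar{k}_i(t) \propto \left(\frac{t}{t_i}\right)^{\beta'},
\qquad
\beta' = \frac{1-r}{1+r(d-1)},

\gamma = 1 + \frac{1}{\beta'}
       = \frac{(1-r) + 1 + r(d-1)}{1-r}
       = \frac{2 + r(d-2)}{1-r}.
```

The last line uses the same relation between the growth exponent and $\gamma$ that links equations (7) and (11).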



This behavior is confirmed by simulations, as can be seen in Fig. 2. In addition, we
simulated the case in which providers merge at a uniform rate. We assume that at each time
step providers are added with probability $r$; with probability $f < r$ a randomly chosen
provider vanishes and the users connected to it are rewired to another provider according to
the "richer-gets-richer" rule; and users are introduced with probability $u = 1 - r - f$. The
assumption made on $f$ is needed to avoid the extinction of almost all providers, as each
merging decreases the number of providers by 1: if the merging rate is higher than the birth
rate of new providers, the number of providers rapidly tends to 1. As in the previous
versions of the model, this growing network displays a scale-free distribution of the degree.
This result is shown in the lower part of Fig. 2, where we plot the degree distribution for
different values of $r$ and constant $f$.

This scale-free behaviour, characteristic of "social" networks, has recently been explained
[?] by means of two ingredients: firstly, the number of nodes has to grow in time, and
secondly, the nodes with greater degree are advantaged in acquiring new links. That model
gives an exponent $\gamma = 3$, while the real exponents found in the social networks
considered above are in the range between 2 and 3. Since then, other models of growing
networks have been proposed whose degree distributions are closer to those of the
corresponding real networks [21].

The model we introduced describes the dynamical development of a network composed of two
classes of nodes, as is the case for the Internet connections between providers and users. In
fact, the physical structure of the Internet is made of superposed levels of nodes,
corresponding to providers, subproviders, or users at the lowest level, whose degree
distribution has recently been found to show a power-law behavior. The model exhibits the
same scale-free shape, depending on the external parameters $r$, i.e. the fraction of
providers in the total number of nodes, and $d$, the average number of users who join a
newborn provider. The parameters can naturally be tuned to realistic values to recover the
observed exponent of the tail of the degree distribution.

FIG. 1. The degree distribution in the model with $r = 0.29$ and $d = 0.41$ and in the real
Internet data collected on different days. The inset shows the temporal behavior of the
degree of a provider.




FIG. 2. (above) Degree distribution for different values of the provider birth rate $r$ when
users are allowed to have more than one provider. (below) Degree distribution with different
user birth rates $u = 1 - r - f$ and probability of merging $f$.